A Multiplatform Chemometric Approach to Modeling of Mosquito Repellents

Table 9.3: Statistical parameters of the established linear QSAR models for prediction of Rindex of the set of natural and synthesized compounds

Regression model   R2       R2adj    R2cv     RMSE      F
ULR                0.6063   0.5845   0.5306   24.3495   27.7
MLR1               0.7289   0.6971   0.6552   20.7907   22.9
MLR2               0.8150   0.7804   0.7192   17.7024   23.5

that the introduction of an additional predictor variable (an increase in the number of variables) improves the model’s quality more than would be expected by chance.
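The statistics in Table 9.3 are linked by the standard formulas for adjusted R2 and the regression F-ratio, so a short sketch can show how adding a predictor is penalized. The number of compounds (20) and the number of predictors per model (1, 2 and 3) are assumptions: neither is stated in this section, and they were chosen only because they reproduce the tabulated R2adj and F values to within rounding.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R2 for a linear model with p predictors fitted to n compounds."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """Overall F-ratio of the regression, computed from R2."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


# Assumption: the size of the data set is not given in this section.
N_COMPOUNDS = 20

# (model name, R2 from Table 9.3, assumed number of predictors)
MODELS = [("ULR", 0.6063, 1), ("MLR1", 0.7289, 2), ("MLR2", 0.8150, 3)]

results = {
    name: (adjusted_r2(r2, N_COMPOUNDS, p), f_statistic(r2, N_COMPOUNDS, p))
    for name, r2, p in MODELS
}

for name, (r2adj, f_value) in results.items():
    print(f"{name}: R2adj = {r2adj:.4f}, F = {f_value:.1f}")
```

Because the penalty factor (n - 1)/(n - p - 1) grows with every added predictor, R2adj rises only when the gain in R2 outweighs it, which is the case for both MLR models under these assumptions.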

Another confirmation of the quality of QSAR models is the comparison between experimental and predicted values, together with an analysis of the amplitude and randomness of the residuals (the differences between the experimental and predicted values). In the ideal case, the relationship between experimental and predicted values is described by R2 = 1 and all residuals are equal to zero. Quite extensive validation approaches, including cross-validation, have been applied in the studies by De et al. 2018, Natarajan et al. 2008 and Wang et al. 2017.
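The comparison of experimental and predicted values can be sketched in a few lines. The function name and the numbers below are hypothetical and serve only as an illustration; they are not data from the cited studies. RMSE is computed here with n in the denominator, which may differ from the convention behind Table 9.3.

```python
import math


def residual_diagnostics(experimental, predicted):
    """Summarize the agreement between experimental and predicted values."""
    residuals = [y - y_hat for y, y_hat in zip(experimental, predicted)]
    n = len(residuals)
    mean_y = sum(experimental) / n
    ss_res = sum(e * e for e in residuals)                  # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in experimental)   # total sum of squares
    return {
        "residuals": residuals,
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        # A mean residual far from zero hints at systematic bias.
        "mean_residual": sum(residuals) / n,
        "max_abs_residual": max(abs(e) for e in residuals),
    }


# Hypothetical activity values, for illustration only.
experimental = [12.0, 35.0, 48.0, 60.0, 81.0]
predicted = [15.0, 30.0, 50.0, 58.0, 84.0]

report = residual_diagnostics(experimental, predicted)
print(f"R2 = {report['r2']:.3f}, RMSE = {report['rmse']:.2f}")
print("residuals:", report["residuals"])
```

Residuals that alternate in sign and show no trend against the predicted values suggest random error, whereas a run of same-sign residuals would point to a systematic deficiency of the model.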

9.3.5 Chemometric classification methods as a platform for repellents selection

9.3.5.1 Cluster analysis

Cluster analysis is one of the most favored chemometric pattern recognition techniques. In the modeling of compounds with repellent activity, it can be applied to group the compounds on the basis of their molecular or bioactivity properties. The clustering can be carried out as agglomerative clustering (each object starts as its own cluster and the objects are then gradually merged into one group) or as divisive clustering (one group is split into two, and each of these is then split in turn). Because a hierarchy of clusters is built, this analysis is also known as hierarchical cluster analysis (HCA). The results of HCA are usually presented in a visual form known as a dendrogram (Figure 9.5).
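The agglomerative variant can be sketched in plain Python. Autoscaling of the descriptors, Euclidean distance and average linkage are assumptions made for this illustration; the section does not state which settings were used for Figure 9.5. The object labels and descriptor values are hypothetical.

```python
import math


def autoscale(rows):
    """Scale each descriptor column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
        for c, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points, n_clusters=1):
    """Average-linkage agglomerative clustering.

    Returns the clusters (lists of row indices) and the ordered merge history,
    which is the information a dendrogram displays.
    """
    clusters = [[i] for i in range(len(points))]  # every object starts alone
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average distance over all pairs of members of the two clusters.
                total = sum(
                    euclidean(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((list(clusters[i]), list(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges


# Hypothetical objects described by two hypothetical descriptors.
labels = ["obj1", "obj2", "obj3", "obj4"]
descriptors = [[1.0, 10.0], [1.2, 11.0], [5.0, 50.0], [5.1, 52.0]]

clusters, merges = agglomerate(autoscale(descriptors), n_clusters=2)
for left, right, distance in merges:
    print([labels[k] for k in left], "+", [labels[k] for k in right], f"at {distance:.3f}")
print("clusters:", [[labels[k] for k in c] for c in clusters])
```

Stopping at `n_clusters=2` corresponds to cutting the dendrogram just below its top-level merge; running to a single cluster records the full hierarchy. For real data sets a library routine is usually preferable to this quadratic-memory-free but slow sketch.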

The dendrogram presented in Figure 9.5 shows the grouping of natural repellents and novel compounds synthesized by Thireou et al. 2018 in the space of their calculated physicochemical descriptors: boiling point (BP), melting point (MP), critical temperature (CT), critical pressure (CP), critical volume (CV), Gibbs energy (GE), lipophilicity (logP), molar refractivity (MR), topological polar surface area (tPSA), calculated lipophilicity descriptor (ClogP) and calculated molar refractivity (CMR). All the descriptors were calculated with the ChemBioDraw Ultra 13.0 program (PerkinElmer Inc.). The dendrogram indicates that some of the synthesized compounds are quite similar to the natural repellents in the space of the calculated molecular features. The closest similarity is between n-butyl cinnamate and the Syn7 compound, and between ethyl cinnamate and the Syn4 compound, whose structures are presented in Figure 9.6. The HCA results in Figure 9.5 also show two main clusters: one containing a group of seven synthetic compounds together with lauric acid, and the other containing the rest of the compounds.